ABSTRACT
FOR

A STUDY OF LONGITUDINAL CAUSAL MODELS
COMPARING GAIN SCORE ANALYSIS WITH
STRUCTURAL EQUATION APPROACHES



The logic of using a gain score approach versus longitudinal causal models
is studied in this secondary analysis of a complex data base. The gain score
model used by the Federal Reserve Bank and the School District of Philadelphia
in their "WHAT WORKS IN READING?" study is successively refined using the LISREL
structural equation program. First, the Philadelphia data base is described and
then difficulties of using gain score models are discussed.

Regression estimates of the different models are described. Procedures
dealing with identification, specification and collinearity are exemplified. A
sensitivity analysis of measurement and specification error shows the degree to
which estimated parameters are affected by researchers' assumptions. The
reanalysis shows improvements in the understanding of achievement test data and
the logic of how to analyze data bases with longitudinal dependent variables.






TABLE OF CONTENTS

INTRODUCTION

DATA ANALYSIS
    Gain Score Model Used by Philadelphia Researchers
    A Structural Equation Model of the Data
    Analysis and Discussion of Models 1 and 2

A STRUCTURAL EQUATION MODEL WITH NON-ZERO ERROR ASSUMPTIONS
    Discussion of Model 4
    Goodness of Fit and Longitudinal Assumptions
    Comparison of Models 2 and 4

THE ANALYSIS OF MEASUREMENT AND SPECIFICATION ERROR
    Results of the Sensitivity Analysis
    Results of Testing a Model with Realistic Error Assumptions

CONCLUDING COMMENTS

REFERENCES






LIST OF TABLES AND FIGURES

TABLE 1    Code Names, Definitions, Means and Standard Deviations
           for Eleven Independent Variables and Three
           Dependent Variables
TABLE 2    Correlation Matrix for 14 Variables in Philadelphia
           Achievement Study
TABLE 3    Variance-Covariance Matrix of Gain Score With the
           Two Achievement Scores
MODEL 1    Gain Score
MODEL 2    A Longitudinal Model
TABLE 4A   LISREL Estimates of Model 1 and Model 2
TABLE 4B   LISREL Estimates of Model 1 and Model 2 With
           Insignificant Paths Suppressed
MODEL 3    Independent Measurement Model
MODEL 4    The Full Model
TABLE 5    Comparison of Model 2 With Model 4
TABLE 6    Resulting Parameter Estimates Given Error Assumptions






INTRODUCTION 



Current trends in applied research have witnessed widespread adaptation
of multiple regression techniques to program evaluations. While regression
analysis is a powerful technique, it owes much of its power to highly
restrictive and often unrealistic assumptions. The interpretation of regression
results, especially the assessment of the relative impact or importance of
independent variables, can be treacherous. There is no assurance that
consumers of applied research, in this case school administrators, educational
researchers, teachers, or politicians, will understand its limitations.

This paper compares methodological procedures used to analyze longitudinal
data. It critically compares the use of gain scores to structural equation
approaches. The analytic techniques discussed here are applicable to any
longitudinal analysis. These general techniques are exemplified in the
secondary analysis of data from the "WHAT WORKS IN READING?" study conducted by
the School District of Philadelphia.

The data base examined in this study is the result of a joint effort in
1977 and 1978 by the Federal Reserve Bank and the School District of
Philadelphia to study factors affecting the reading achievement of 1,800
elementary school children. Approximately 8,000 copies of their report "WHAT
WORKS IN READING?" and its summary have been distributed throughout the world
(Kean, Summers, Raivetz and Farber, 1979).

Philadelphia data are reanalyzed using LISREL. LISREL provides an
extraordinarily flexible framework for parameter estimation of complex models,
and is well adapted to a wide variety of models, including recursive or
non-recursive, as well as models incorporating latent structures (Jöreskog and
Sörbom, 1981).

Following an introduction to the data base, the analysis proceeds in three
steps. First, specification of the dependent variable is examined. The
original report (Kean et al., 1979) treated reading improvement as a net change
or gain score. Results of using the gain score as a dependent variable are
compared to results when reading at time one and time two are treated as
separate dependent variables in a longitudinal model (see Models 1 and 2).

Second, eleven independent variables are re-examined to incorporate a
latent, 10 factor structure and the results of this analysis are compared to
the previous results. This analysis exemplifies the use of a factor structure
to control for collinearity. Third, the 10 factor model is subjected to a
sensitivity analysis (see Land and Felson, 1978) with regard to random
measurement error in the dependent variables, and to specification error due to
the omission of theoretically important independent variables. This analysis
demonstrates how small changes in model specification and residual assumptions
can modify results.

Information on 25 reading teachers, 25 principals, 94 teachers, 68 reading
aides, and the 1,800 students yielded 245 variables which were analyzed
by Philadelphia researchers using multiple regression techniques. The sample
selection process was done by school. Average Total Reading Achievement
Development Scale Scores (ADSS) on the California Achievement Tests (CAT-70)
for 1974 and 1975 were summed over grades 1-4. The 190 schools studied were
ranked on the difference of these sums. The final sample contained ten
"high-high", five "middle-middle", and ten "low-low" schools which gave
representation from all eight administrative sub-districts of the city. The
sample totaled 25 schools. The students in these schools were representative of
students in schools having high, middle, and low success in reading achievement.
Data collection procedures included interviews with school personnel (e.g.,
principals, teachers, reading aides) and recording data from pupil records.
The data collection process was completed in two weeks (Kean, Summers, Raivetz
and Farber, 1979).

Over 500 multiple regression runs were conducted to establish which of
the 245 variables measured had the most impact on the reading achievement
gain. Eighteen variables were identified as contributing to achievement gain.
"WHAT WORKS IN READING?" points out the difficulties of analyzing this complex
data base without a theory. The Philadelphia technical report consists
almost entirely of descriptions of variables and correlation matrices.
Cross-tabulations, path analysis, or modeling of the 245 variables were not
undertaken, and neither the relations among the 18 variables nor their impact
on third and fourth grade test scores were analyzed.

As mentioned above, "WHAT WORKS IN READING?" found 18 of the 245 variables
studied to have a statistically significant beta weight in predicting the gain
score. Table 1 lists definitions, means, and standard deviations for eleven
of the independent variables, and for the three dependent variables: the gain
score, and the third and fourth grade reading scores. The eleven independent
variables include measures of student, teacher, and school organization.

Table 2 shows the 14x14 correlation matrix of the variables listed in
Table 1. The impression obtained from Table 2 is that the matrix is thin. Of
the 90 correlations in it, 17 or 19% are greater than .15, and 12 or 13% are
greater than .25. Table 2 contains 55 correlations among pairs of the 11
independent variables and 5, or 9%, are greater than .25. The highest
correlation of any variable with CATGAIN, the gain score, is .08.




TABLE 1

Code Names, Definitions, Means, and Standard Deviations for Eleven Independent
Variables and Three Dependent Variables

Code                                                                   Standard
Name     Definition                                           Mean     Deviation

T2 - T1  Difference between Grade 3 and Grade 4 scale score   28.43    52.50

T1       California Achievement Test, Reading Comprehension   385.06   67.74
         Scale Score for Grade 3, 1975

X1       Days student was present                             130.51   10.41

X2       Student attended kindergarten 1=NO, 2=YES            1.80     0.40

X3       Number of non-teaching supportive staff per school   *        11.0?

X4       Percent of students scoring above 84th percentile,   .20      .13
         California Achievement Test 1976, Total Reading

X5       Percent of classroom teachers with less than         [?]      [?]
         2 years experience

X6       Number of teacher pay periods with no absence        13.89    3.79

X7       Teacher attends outside professional conference      1.17     .39
         meetings 1=NO, 2=YES

X8       First year teaching grade 4 1=NO, 2=YES              1.17     .35

X9       Minutes per week of individual independent reading   73.35    60.31

X10      Teacher would select the same reading program again  1.54     .50

X11      Times per week aide in room during reading           2.55     2.31

T2       California Achievement Test, Reading Comprehension   412.50   72.56
         Scale Score for Grade 4, 1976

* Means for X3 were not shown in the November 1979 technical report of
"WHAT WORKS IN READING?"
? marks a digit that is uncertain in the source copy; [?] marks an illegible
entry.
TABLE 2

CORRELATION MATRIX FOR 14 VARIABLES IN PHILADELPHIA ACHIEVEMENT
STUDY, N = 1,363

          T2-T1     T1     X1     X2     X3     X4     X5     X6     X7     X8     X9    X10    X11     T2
T2-T1     1.000
T1        -.392  1.000
X1         .074   .161  1.000
X2        -.020   .123   .116  1.000
X3        -.004  -.290  -.134      †  1.000
X4        -.051   .386   .134      †  -.626  1.000
X5          [?]    [?]  .056?  -.095   .327  -.154  1.000
X6         .042   .076  -.017  .006?   .135  -.040    [?]  1.000
X7        -.03?  -.112  -.033   .004  -.027  -.082   .118   .005  1.000
X8        -.00?  -.139  -.061   .013   .104   .007   .021   .021   .021  1.000
X9         .083   .090   .086   .034  -.143   .113   .005  -.004  -.125   .021  1.000
X10       -.007   .187  -.029  -.006  -.142   .030  -.033   .080   .023  -.011  -.119  1.000
X11        .0?4  -.362  -.116  -.110   .527  -.427   .074   .010  .254?   .0?2  -.069  -.106  1.000
T2         N.A.   .722   .197   .129  -.273   .382  -.065  .101?  -.132  -.132   .144   .169   .335  1.000

Variable codes are defined in Table 1.
? marks a digit or sign that is uncertain in the source copy; [?] marks an
illegible entry.
† Only one value, .141, is legible for the X2 column of rows X3 and X4; the
row it belongs to cannot be determined from the source copy.
DATA ANALYSIS

Gain Score Model Used by Philadelphia Researchers

The regression runs done by Federal Reserve Bank economists used the
difference between the third and fourth grade reading achievement scores as
a single dependent variable. The use of "difference", "change", or "gain"
scores has been thoroughly examined (Thorndike and Hagen, 1955; Thorndike,
1966; Bohrnstedt, 1969; Cronbach and Furby, 1970; Alwin and Sullivan, 1975; Kim
and Mueller, 1976; Kessler, 1977; Pendleton, Warren and Chang, 1979). These
examinations have generally advised against using gain scores because: the
difference between the two measures has lower reliability than the measures
considered separately; their use requires low error variance and high
reliability in variables; the calculation of the gain score reliability tends
to be unstable because it depends on five other values, the three correlations
and two variances; and the analysis of gain scores is complicated by the
effects of regression toward the mean.
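
The dependence of gain score reliability on five quantities can be sketched
with the classical test theory formula for the reliability of a difference
score. The sketch below is illustrative only. The standard deviations and the
Time 1 and Time 2 correlation are taken from Tables 1 and 2, but the two test
reliabilities of .90 are assumed values that the Philadelphia report does not
supply, and the formula assumes uncorrelated measurement errors.

```python
def gain_score_reliability(sd1, sd2, r12, rel1, rel2):
    """Classical-test-theory reliability of the difference score T2 - T1.

    sd1, sd2   : standard deviations of the Time 1 and Time 2 scores
    r12        : correlation between the Time 1 and Time 2 scores
    rel1, rel2 : reliabilities of the two tests (assumed, not from the study)
    Measurement errors are assumed to be uncorrelated with each other.
    """
    cov12 = r12 * sd1 * sd2
    true_var = rel1 * sd1 ** 2 + rel2 * sd2 ** 2 - 2 * cov12   # true-score variance of the gain
    observed_var = sd1 ** 2 + sd2 ** 2 - 2 * cov12             # observed variance of the gain
    return true_var / observed_var


# Study values for the SDs and the correlation; .90 reliabilities are hypothetical.
rel_gain = gain_score_reliability(sd1=67.74, sd2=72.56, r12=0.722,
                                  rel1=0.90, rel2=0.90)
print(f"reliability of the gain score: {rel_gain:.3f}")
```

Under these assumed reliabilities the gain score comes out markedly less
reliable than either test taken separately, which is the first objection listed
above.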

Similarly, Thorndike (1963:40, 1966:124) points out two characteristics
of using gain scores. First, the gain score is almost certain to be
negatively correlated with the initial achievement score. Second, the variance
of the gain scores is in some cases no more than one-fourth the size of the
variance of the Time 1 and Time 2 scores. Table 3 illustrates Thorndike's
comments. Despite these disadvantages, gain scores continue to be used in
applied research and evaluation work (Alwin and Sullivan, 1975).

Table 3 shows the variance-covariance matrix of the gain score and the third
and fourth grade achievement scores. Thorndike's comments (see Thorndike,



TABLE 3

Variance-Covariance Matrix of Gain Score with the Two Achievement Scores

N = 1,363

                                            GRADE 3          GRADE 4
                         GAIN SCORE         READING SCORE    READING SCORE

GAIN SCORE               2,756.25

GRADE 3
READING SCORE            -1,039.52          4,588.71

GRADE 4                  Not calculated in  3,548.78         5,264.95
READING SCORE            Philadelphia Study




[Path diagram: each of the eleven independent variables listed below has a
path to the single dependent variable, the gain score.]

    Student's attendance at school
    Student went to kindergarten
    Ratio of non-teaching staff to students
    Proportion of high achieving students in school
    Proportion of new teachers in school
    Number of absences of fourth grade teacher
    Attendance of fourth grade teacher at outside conferences
    Experience of fourth grade teachers
    Number of minutes students spend reading independently
    Teacher would select same reading program again
    Hours per week of classroom reading aide support

Model 1:

A Gain Score Model

No time assumptions: all independent variables are
assumed to affect a single dependent variable. All error
terms are assumed to be zero.
1963:40, 1966:124) are appropriate here. As anticipated, the gain score is
negatively correlated with the initial score, and the variance of the gain
score is 60% the size of the Time 1 variance and 52% the size of the Time 2
variance.
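
The gain score entries of Table 3 follow arithmetically from the variances and
covariance of the two achievement scores. A minimal check, using only the three
Grade 3 and Grade 4 entries of Table 3 (the variable names are illustrative):

```python
# Entries of Table 3 for the two achievement scores
var_t1 = 4588.71      # variance of the Grade 3 (Time 1) score
var_t2 = 5264.95      # variance of the Grade 4 (Time 2) score
cov_t1_t2 = 3548.78   # covariance of the two scores

# Variance of the difference T2 - T1 and its covariance with T1
var_gain = var_t1 + var_t2 - 2 * cov_t1_t2
cov_gain_t1 = cov_t1_t2 - var_t1

# Size of the gain score variance relative to each achievement score variance
ratio_t1 = var_gain / var_t1
ratio_t2 = var_gain / var_t2

print(f"var(gain)     = {var_gain:.2f}")
print(f"cov(gain, T1) = {cov_gain_t1:.2f}")
print(f"var(gain)/var(T1) = {ratio_t1:.2f}, var(gain)/var(T2) = {ratio_t2:.2f}")
```

The computed values agree with the tabled 2,756.25 and -1,039.52 to within
rounding. The covariance of the gain with the initial score is negative
whenever the Time 1 variance exceeds the covariance between the two scores.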

An additional problem with the gain score is that it obliterates
information. For example, in analysis of reading achievement scores, it could
be hypothesized that reading level and reading gain are associated. However,
computation of a gain score eliminates data on the student's reading level at
either time. In this case, for example, we find a negative correlation
between third grade reading level and reading gain from grade three to grade
four, indicating that lower students tended to improve more rapidly than
higher level students. Consequently, factors associated with high gain may
also contribute to low overall achievement. This makes analysis of a gain
score difficult to interpret.

Model 1 shows the gain score model. All 11 independent variables are
assumed to influence the gain score and are assumed to be measured without
error. The gain score model was analyzed using the just-identified multiple
regression option of LISREL.

A Structural Equation Model of the Data

It is theoretically reasonable that the gain score analysis can be
extended by considering alternative models that use data from both times
rather than using the difference score as a single dependent variable. The
analysis below used the maximum likelihood exploratory factor analysis (EFAP)
and structural equation programs, LISREL IV and V, of Jöreskog and Sörbom
(1981).



Maximum likelihood estimation procedures, originally developed by the
British statistician R. A. Fisher (1921), yield estimations which are efficient
and consistent for large samples. These approaches were introduced to
sociologists in the middle 1970's (Hauser and Goldberger, 1971; Burt, 1973).
Two-, three-, and four-wave multi-variable models have been extensively
studied with these approaches (see, for example, Duncan, 1972; Hannan and
Young, 1977; Hargens, Reskin and Allison, 1976; Long, 1976; Jöreskog and
Sörbom, 1977; and Wheaton, Muthén, Alwin and Summers, 1977).

Model 2 shows one alternative model for analyzing the Philadelphia data
using the two dependent variables, Time 1 and Time 2, together instead of
analyzing their difference. Two multiple regression runs were made using the 11
variables; first the third grade achievement variable was used as the
dependent variable, then the fourth grade variable was used.

A review of "WHAT WORKS IN READING?" and conversations with School District
of Philadelphia research and evaluation staff showed that 3 of the 11
variables can be hypothesized to influence both the third and fourth grade
scores while the other 8 can be hypothesized to influence only the fourth grade
score. The 3 variables influencing scores at both times were whether the
student went to kindergarten (X2), the proportion of students in the school
scoring well on the achievement test (X4), and the proportion of new
teachers (X5).

Model 2 contains two structural equations. The first uses only the
independent variables affecting the third grade score. The second uses all 11
independent variables plus the third grade score. In this model, error terms
are also assumed to have expectations of 0.
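
The two-equation structure just described can be sketched with ordinary least
squares applied equation by equation, which is appropriate for a recursive
model whose disturbances are assumed uncorrelated. The sketch is not the LISREL
analysis reported here: the data are synthetic, and every coefficient value
(`gamma1`, `gamma2`, `beta`) is a hypothetical choice made only so the example
runs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1363                              # study sample size; the data are synthetic

X = rng.normal(size=(n, 11))          # stand-ins for X1..X11
early = [1, 3, 4]                     # zero-based columns for X2, X4 and X5

gamma1 = np.array([0.3, 0.5, -0.2])   # hypothetical effects of X2, X4, X5 on T1
gamma2 = rng.uniform(-0.2, 0.2, 11)   # hypothetical effects of X1..X11 on T2
beta = 0.65                           # hypothetical effect of T1 on T2

t1 = X[:, early] @ gamma1 + rng.normal(size=n)
t2 = beta * t1 + X @ gamma2 + rng.normal(size=n)


def ols(y, predictors):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


eq1 = ols(t1, X[:, early])                 # equation 1: T1 on X2, X4, X5
eq2 = ols(t2, np.column_stack([t1, X]))    # equation 2: T2 on T1 and X1..X11
beta_hat = eq2[1]                          # estimated effect of T1 on T2

print("equation 1 slopes:", np.round(eq1[1:], 3))
print("estimated effect of T1 on T2:", round(float(beta_hat), 3))
```

Equation 1 has three slopes and equation 2 has twelve, mirroring the
restriction that eight of the independent variables affect only the fourth
grade score.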
[Path diagram: three of the independent variables listed below have paths to
the third grade score; all eleven, together with the third grade score, have
paths to the fourth grade score.]

    Student's attendance at school
    Student went to kindergarten
    Ratio of non-teaching staff to students
    Proportion of high achieving students in school
    Proportion of new teachers in school
    Number of absences of fourth grade teacher
    Attendance of fourth grade teacher at outside conferences
    Experience of fourth grade teachers
    Number of minutes students spend reading independently
    Teacher would select same reading program again
    Hours per week of classroom reading aide support

Model 2:

A Longitudinal Model

Three independent variables are assumed to affect Time 1,
the third grade score. All variables and the third grade
score are assumed to affect Time 2, the fourth grade score.
All error terms are assumed to be zero.



The LISREL IV computer program was used to analyze the covariance matrix.
Briefly, the LISREL model consists of two parts: the measurement model and the
structural model. The measurement model specifies how latent or hypothetical
constructs are measured in terms of observed variables. There are two
measurement models, one for dependent variables, and one for independent
variables.

Let $x' = (x_1, x_2, \ldots, x_q)$ be a vector of observed independent
variables and let $y' = (y_1, y_2, \ldots, y_p)$ be a vector of observed
dependent variables. Then,

$$ y = \Lambda_y \eta + \varepsilon \qquad (1) $$

$$ x = \Lambda_x \xi + \delta \qquad (2) $$

where $\xi$ and $\eta$ are random vectors of latent independent and dependent
variables, $\xi' = (\xi_1, \xi_2, \ldots, \xi_n)$ and
$\eta' = (\eta_1, \eta_2, \ldots, \eta_m)$ respectively. The vectors
$\varepsilon$ and $\delta$ are errors of measurement in y and x respectively
when y and x are measured as deviations from their means. The matrices
$\Lambda_x$ (q x n) and $\Lambda_y$ (p x m) are regression matrices of x on
$\xi$ and y on $\eta$ respectively. The structural model linking the two
measurement models is given in (3).

$$ B \eta = \Gamma \xi + \zeta \qquad (3) $$

where $B$ and $\Gamma$ are coefficient matrices and
$\zeta' = (\zeta_1, \zeta_2, \ldots, \zeta_m)$ is a random vector of residuals
reflecting disturbance terms or errors in equations (Jöreskog and Sörbom,
1978).
The two equations comprising Model 2 are given in (4) and (5).

$$ \eta_1 = \Gamma_1 \xi + \zeta_1 \qquad (4) $$

$$ -\beta \eta_1 + \eta_2 = \Gamma_2 \xi + \zeta_2 \qquad (5) $$

$\eta_1$, the third grade score, is seen to be comprised of two parts:
$\Gamma_1 \xi$, which represents the effects of the three independent factors,
and $\zeta_1$, which is the residual variance unexplained by the three
variables affecting the third grade score. Equation (5) can also be rewritten
as (6).

$$ \eta_2 = \beta \eta_1 + \Gamma_2 \xi + \zeta_2 \qquad (6) $$

$\eta_2$, the fourth grade score, is seen to be comprised of three parts:
$\beta \eta_1$, which represents the effect of $\eta_1$ on $\eta_2$;
$\Gamma_2 \xi$, which is the effect of the independent variables; and
$\zeta_2$, which is the unexplained residual variance of $\eta_2$.

This model is recursive in that the fourth grade score is assumed to have
no effect on the third grade score. Identification problems in recursive
models have been frequently commented on (Heise, 1969, 1970). In order to
make recursive models identifiable, restrictive assumptions are generally made
about error terms. For example, in order to identify Model 2 the usual
procedure is to assume that all error terms and covariances of error terms are
zero, i.e., that the error terms are uncorrelated with each other. This
includes assuming the covariance of the error terms $\zeta_1$ and $\zeta_2$ in
Equations (4) and (5) is equal to zero.

The assumption of zero error terms and zero error covariances is equivalent
to assuming that all variables are measured without error. On the independent
variable side of the model, in LISREL terminology, this situation is called
"fixed X"; $\Lambda_x$ is assumed to be an identity matrix, the independent
variable error matrix $\Theta_\delta = 0$ and $x = \xi$. Similar assumptions
are made on the dependent variable side, i.e., $\Lambda_y = I$,
$\Theta_\varepsilon = 0$ and $y = \eta$. Given these assumptions, the
structural equation model becomes the following.

$$ B y = \Gamma x + \zeta \qquad (7) $$



Analysis and Discussion of Models 1 and 2

Table 4A presents estimates for Model 1 and Model 2. There are dramatic
differences in the estimates obtained in magnitude and sign. Note also that,
with the exception of the effect of T1 in equation 2 of Model 2, most effects
are small. In Model 1, the gain score model, eleven independent variables
explain just over 2.5 percent of the gain score variance. With effects of
this magnitude it is hopeless to draw substantive conclusions of consequence.

However, in order to increase efficiency and remove clutter, a second set
of estimates (shown in Table 4B) was calculated, fixing insignificant effects
at 0. In terms of goodness of fit and explained variance, dropping
insignificant effects had little adverse consequence, indicating that
information loss was trivial. At the same time, the reduced complexity of the
models makes them easier to compare.

Estimates were prepared, suppressing to 0 the effects of X2, X5 and X7
in equation 2. In this set of estimates, the effect of X5 in equation 1


TABLE 4A

LISREL ESTIMATES OF MODEL 1 AND MODEL 2

                                   Dependent Variables

                 MODEL 1                          MODEL 2
Independent
Variables        T2 - T1              T1                   T2

X1               .373? (.074?)                             [?]
X2               [?]                  [?]                  4.183   (.024)
X3               [?]                                       .434*   (.066)
X4               [?]                  [?]                  63.114* (.113)
X5               31.111? (.081)       -24.782* (-.050)     3.733   (.007)
X6               .635*   (.046)                            1.064*  (.056)
X7               -3.829* (.028)                            -4.075  (-.022)
X8               1.803*  (.012)                            -14.180* (-.068)
X9               .062    (.071)                            .123*   (.102)
X10              -.625*  (-.006)                           12.267* (.085)
X11              .010*   (.000)                            -3.642* (-.116)
T1                                                         .653*   (.610)

R-squared        .0264                .195                 .469

Chi-square/d.f.  .0003/0              277.81/8

* Significance at less than .05
Figures in parentheses are standardized estimates.
? marks a value that is uncertain in the source copy; [?] marks an illegible
entry. Asterisks are reproduced as they appear in the source copy.
was also found to be nonsignificant, so a second set of estimates was
prepared, setting at 0 X5 in equation 1. In this second set of estimates,
the effect of X3 shifted just out of the critical region.

First, the gain score model yields quite a different picture than the 2
wave model. The variables X2 and X7 are non-significant in Model 1 and in
equation 2 of Model 2. X7, the teacher's attendance at outside conferences, is
an ambiguous measure. It may measure level of professional interest and
awareness, but it may also measure teacher absence from the classroom, or a
desire for upward professional mobility, i.e., to get out of the classroom.

X2, kindergarten, is an interesting variable since it is non-significant
in Model 1 and in equation 2 of Model 2, but significant in equation 1 of
Model 2. Kindergarten experience has an indirect effect which is missed
altogether in the gain score model.

The variable X5, teacher experience, is significant in Model 1 but
not in Model 2. In other words, students of experienced teachers show more
improvement than students of inexperienced teachers, but when we control for
reading competence at Time 1, the teacher experience makes no difference in
reading competence at Time 2. The effect in Model 1 could represent a
difference in assignment. It would seem reasonable that the school would take
teacher experience into account in making classroom assignments.

Four teacher and classroom variables, X6, X8, X10, and X11, are
non-significant in Model 1, but are significant in Model 2. It seems their
effects show when assignment is taken into account.

Three remaining variables, X1, X4, and X9, are significant in both
models. However, only X1 and X9 agree in both models. It is interesting to
note that X1 and X9, along with X2, are the only independent variables
TABLE 4B

LISREL ESTIMATES OF MODEL 1 AND MODEL 2
WITH INSIGNIFICANT PATHS SUPPRESSED

                                   Dependent Variable

                 MODEL 1                          MODEL 2
Independent
Variable         T2 - T1              T1                   T2

X1               .383    (.075)                            .4?9    (.067)
X2               0†                   10.999  (.065)       0†
X3               -.344   (-.071)                           .303    (.045) (not significant)
X4               -40.351 (-.100)      198.813 (.381)       64.051  (.115)
X5               32.349  (.084)       0†                   0†
X6               0†                                        .860    (.045)
X7               0†                                        0†
X8               0†                                        -8.139  (-.039)
X9               .067    (.076)                            .102    (.085)
X10              0†                                        7.604   (.052)
X11              0†                                        -2.065  (-.065)
T1                                                         .680    (.635)

R-squared        .024                 .195                 .463

Chi-square/d.f.  4.407/6              272.712/12

† Fixed at 0
Figures in parentheses are standardized estimates.
? marks a digit that is uncertain in the source copy.
measured at the student level. All others are at the classroom and school
level. The interpretation of X1, student attendance, and X9, time in the
classroom spent reading independently, is straightforward. Students who come
to school more, and spend more time reading while at school, can read better
at the end of the year. Not very profound, but a finding nonetheless.

The variables X3 and X4, supplementary staff and proportion of high
scoring students in the school, share a positive sign in Model 2 and a
negative sign in Model 1. Looking first at X4, students in schools including a
high proportion of high scoring students do not improve as much as students in
schools with a lower proportion of high scoring students, but when reading
level at time 1 is controlled, students in schools having a high proportion of
high scoring students score higher at time 2.
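
A sign reversal of this kind can be reproduced with a small simulation. The
sketch below uses synthetic data and hypothetical coefficients (1.0, 0.6 and
0.2), not the Philadelphia estimates. When a variable x raises the Time 1 score
and the effect of Time 1 on Time 2 is less than one, the coefficient of x in a
gain score regression is approximately 0.2 + (0.6 - 1)(1.0) = -0.2, even though
its effect on Time 2 with Time 1 controlled is +0.2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000                                    # large synthetic sample

x = rng.normal(size=n)                       # hypothetical school-level variable
t1 = 1.0 * x + rng.normal(size=n)            # x raises the Time 1 score
t2 = 0.6 * t1 + 0.2 * x + rng.normal(size=n) # positive direct effect of x on Time 2
gain = t2 - t1


def ols(y, *predictors):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


b_gain = ols(gain, x)[1]     # gain score model: gain regressed on x
b_t2 = ols(t2, t1, x)[2]     # longitudinal model: T2 on T1 and x; coefficient of x

print(f"effect of x on the gain score:     {b_gain:+.3f}")
print(f"effect of x on T2 controlling T1:  {b_t2:+.3f}")
```

The two coefficients have opposite signs although they describe the same
synthetic process, which parallels the pattern reported for X4 in Models 1
and 2.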

This is a complex association. The variables X3 and X4 are highly
correlated negatively, -.626. Considering just Model 2, they have opposite
signed correlations with the dependent variable, but their effects in Model 2
have the same sign. Substantively, it seems that X4 is measuring the level
of reading competence in the school. It is arguable that what is being
measured is the socioeconomic level of the school; middle and upper middle
class students tend to have higher levels of scholastic success than working
and lower class students.

In either case, Model 2 suggests that since supplemental staff persons
are assigned on the basis of need, schools with low general levels of
competence will receive more staffing resources, accounting for the high
negative correlation between X3 and X4. One could call this an allocation
effect. Consequently, X3 has a negative correlation with T2 because of this
allocation effect; but when X3 and X4 are entered in the same equation, the
partial effect of X3 is positive, suggesting that when the allocation effect of
staffing is controlled, the effect on reading levels is positive.

In Model 1 the effects of both X4 and X3 on the gain score are negative,
leading to the conclusion that supplementary staffing has a detrimental
influence. Instead, what apparently is operating is a negative association
between gain and initial competence level. Low students have higher gains,
perhaps because of a ceiling effect, i.e., less room for improvement, and
perhaps also because of the allocation effect of supplementary staff, i.e.,
less concentrated instruction.

The foregoing interpretation seems satisfactory except that it is
contradicted by the effect of X11. The variable X11, time in classroom of
reading aides, should be expected to parallel the effects of X3, to the
extent that both measure supplementary staffing. The only major difference
between the two is that X11 is measured at the classroom level. There are two
possible interpretations: first, it may be that a collinearity effect is
distorting the effect of X11, since it is correlated with both X3, .527,
and X4, -.427. This may also explain why X3 becomes nonsignificant (see
Table 4B). Second, it may be that there is a true negative component in X11.
For example, it has been suggested (Conant, 1971) that classroom aides may be
misused, supplanting rather than supplementing instruction from more highly
trained and qualified teachers. In other words, teachers who make the highest
use of aides may be over-relying on aides.
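
The degree of overlap among the three staffing-related variables can be
summarized directly from their reported correlations. The sketch below computes
the squared multiple correlation of X11 with X3 and X4 and the corresponding
variance inflation factor. It is a rough diagnostic added for illustration, not
part of the original analysis, and it considers only these three variables
rather than the full set of predictors in Model 2.

```python
import numpy as np

# Correlations reported in Table 2
r_x3_x4 = -0.626     # X3 with X4
r_x11_x3 = 0.527     # X11 with X3
r_x11_x4 = -0.427    # X11 with X4

R_pred = np.array([[1.0, r_x3_x4],
                   [r_x3_x4, 1.0]])          # correlations among the predictors X3, X4
r_target = np.array([r_x11_x3, r_x11_x4])    # correlations of X11 with X3 and X4

# Squared multiple correlation of X11 on X3 and X4: r' R^{-1} r
r_squared = float(r_target @ np.linalg.solve(R_pred, r_target))
vif = 1.0 / (1.0 - r_squared)                # variance inflation factor for X11

print(f"R-squared of X11 on X3 and X4: {r_squared:.3f}")
print(f"variance inflation factor:     {vif:.3f}")
```

The variance inflation factor measures how much the sampling variance of the
X11 coefficient is inflated by its overlap with X3 and X4 alone; adding the
remaining predictors could only increase it.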

In summary, Model 2 tends to produce a pattern of effects which come
closer to matching reasonable expectations about reading achievement.
Consistent reversals in signs of effects of independent variables suggest
that performance of a particular student depends largely on that student's
starting point, so that when a student's starting point is taken into account
a clearer picture is obtained of the factors contributing to his or her
progress.

Model 2 accounts for approximately 20 percent of the Time 1 variance and
46 percent of the Time 2 variance. This is an improvement over the minuscule
amount of gain score variance accounted for by Model 1. At the same time it
must be emphasized that effects are small in both models and may be, although
statistically significant, substantively trivial. For example, Model 2
indicates that each day of absence from the classroom results in an expected
loss of half a point on the CAT reading achievement test, when the standard
deviation of that test is 72.56. Model 2 also indicates that each minute
spent per week at independent reading results in an expected increase of
one-tenth point on the CAT (.102). Increasing that time by an hour per week
would amount to a six point improvement, not a very large payoff.
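
The arithmetic behind this judgement can be sketched as follows. The
coefficient (.102 points per minute) and the standard deviation (72.56) are
the figures quoted in the text; the variable names are illustrative only.

```python
# Effect of one extra hour per week of independent reading, using the
# figures quoted in the text. Variable names are illustrative only.
points_per_minute = 0.102   # Model 2 estimate quoted in the text
cat_sd = 72.56              # standard deviation of the CAT reading test

gain_points = points_per_minute * 60        # one extra hour = 60 minutes
gain_in_sd_units = gain_points / cat_sd     # gain relative to the test's spread

print(f"expected gain: {gain_points:.2f} points")
print(f"in SD units:   {gain_in_sd_units:.3f}")
```

Expressed in standard deviation units, the six point gain is under a tenth
of a standard deviation, which is the sense in which the payoff is small.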

These findings must be viewed in the context of specification. We have
seen how strongly the sign and magnitude of effects can be altered when new
information is added. The addition of other variables would probably alter
estimates. The low percentage of variance explained suggests that there must
be other major influences on reading abilities which have not been taken into
consideration.

A STRUCTURAL EQUATION MODEL WITH NON-ZERO ERROR ASSUMPTIONS



Model 2 leaves a major substantive issue unresolved concerning the lack
of symmetry of the effects of $X_3$ and $X_{11}$. Both relate to staffing variables
and should have parallel effects. Three highly intercorrelated variables, $X_3$,
$X_4$ and $X_{11}$, were analyzed using confirmatory factor analyses to explore an
exhaustive range of factor structures. A two-factor structure as in Model 3
produced the most satisfactory fit.

Initially, specification of the measurement model attempted to load all
three variables onto one factor. These attempts resulted in unsatisfactory
goodness of fit diagnostics, leading to the present loading of three variables
on two factors. Identification of these correlated disturbances was
accomplished by placing arbitrary constraints on Phi, the matrix of
correlations among factors. Model 3 had excellent goodness of fit indicators,
e.g. a probability level of .15. The goodness of fit was also improved in
increments by specifying small, correlated error terms among independent
variables.

The Model 3 structure suggests that $X_3$ and $X_{11}$ have different but
overlapping underlying effects. The association of X with both factors
indicates that the allocation effect of supplementary staffing, discussed in
the previous section, applies to both staffing variables. The correlation
between the two factors is .754, indicating that while there is substantial
overlap, each factor is to a degree unique. If the lack of symmetry between
$X_4$ and $X_{11}$ had been due to a distortion from collinearity, it should have
been possible to load all three variables on one factor. The resistance of
the covariance structure to a single factor structure is taken as empirical
evidence that lack of symmetry is not due to collinearity, but that the
effect of X , is indeed unique from that of X .

Employment of confirmatory factor analysis in this way creates a
"measurement model" relating all eleven independent variables to ten
independent factors.

The measurement model hypothesizes an observation to be composed of a
true value attributable to the thing observed, and measurement error as
described earlier in equation 1, $X = \Lambda_x \xi + \delta$, where $X$ is a matrix of
independent observed variables; $\Lambda_x$ is a matrix of factor loadings showing
how much of each variable's variance loads on each factor; $\Phi$ is a
variance-covariance matrix of the unobserved independent factors, and $\delta$ is a
matrix of the errors in measurement.

Equation (8) shows the unstated independent measurement model of a
typical multiple regression approach.

$$X = \Lambda_x \xi \qquad (8)$$

Since the errors are assumed to be zero, $\delta$ drops out of equation (1).
Since multiple regression programs usually assume the number of variables
equals the number of factors, $\Lambda_x$ is an identity matrix. Thus $\Phi$ is shown
to be equal to the variance-covariance matrix of the independent variables.
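
The regression special case can be illustrated numerically. The sketch below
assumes the standard covariance structure implied by the measurement model,
Sigma = Lambda * Phi * Lambda' + Theta_delta, with errors uncorrelated with
the factors. All numbers and function names are hypothetical and are not
taken from the Philadelphia data.

```python
# Implied covariance of the observed X under X = Lambda * xi + delta:
#   Sigma = Lambda * Phi * Lambda' + Theta_delta
# All numbers below are hypothetical, chosen only for illustration.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def implied_cov(lam, phi, theta):
    core = matmul(matmul(lam, phi), transpose(lam))
    return [[core[i][j] + theta[i][j] for j in range(len(core))]
            for i in range(len(core))]

phi = [[1.0, 0.5], [0.5, 2.0]]         # hypothetical factor covariances
identity = [[1.0, 0.0], [0.0, 1.0]]    # one factor per observed variable
no_error = [[0.0, 0.0], [0.0, 0.0]]    # errors assumed to be zero

# Regression case: Lambda = I and delta = 0, so Sigma equals Phi.
sigma_regression = implied_cov(identity, phi, no_error)

# With measurement error, observed variances exceed the factor variances.
some_error = [[0.2, 0.0], [0.0, 0.3]]
sigma_with_error = implied_cov(identity, phi, some_error)

print(sigma_regression)
print(sigma_with_error)
```

In the second case only the diagonal changes, which is why ignoring
measurement error overstates the variance of the underlying factors while
leaving their covariances untouched.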

Models estimating correlated measurement error are useful especially
given a set of theoretically linked and collinear multiple indicators. The
logic of procedures used to obtain models with realistic error estimates can
also be inverted by fixing measurement error along a range of hypothetical
levels in order to observe the sensitivity of other parameters to measurement
errors. Examples of this will be shown later in the paper.



Discussion of Model 4

Model 4 includes the independent measurement model shown in Model 3 with the
exception that the links among the factors shown in Model 3 are not shown in
Model 4 to simplify its presentation. The full model, as described in
equation 3 and exemplified in Model 4, links the independent measurement model
to the dependent measurement model.

Five additional matrices are now used. One of them, gamma ($\Gamma$), links the
independent and dependent models. Three matrices, lambda ($\Lambda_y$), beta ($B$), and
theta epsilon ($\Theta_\varepsilon$), describe the dependent measurement model and one matrix,
psi ($\Psi$), gives the errors made in predicting values of the dependent factors.

Row 1 of gamma represents the effects of the independent factors on Time
1, and row 2 their effects on Time 2. Three of the eleven independent
variables were hypothesized to have an effect on the Time 1 reading score,
but one of those loaded on two factors. Consequently, four of the ten
parameters in row 1 were estimated, and the remaining six were assumed to be
zero. Since all eleven variables, and therefore all ten factors, are
hypothesized to affect the Time 2 reading score, all ten gamma coefficients
were estimated in row 2.
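
The pattern of free and fixed parameters in gamma can be written out
explicitly. In the sketch below, which four factors enter row 1 is not
stated in this section, so the column positions are hypothetical
placeholders; only the counts (four free, six fixed, ten free) come from the
text.

```python
# Free (1) versus fixed-at-zero (0) pattern for the 2 x 10 gamma matrix.
# Which four factors enter row 1 is not stated in the text, so the
# column positions below are hypothetical placeholders.
N_FACTORS = 10
time1_free_columns = {0, 1, 2, 3}   # hypothetical positions

gamma_pattern = [
    [1 if j in time1_free_columns else 0 for j in range(N_FACTORS)],  # Time 1
    [1] * N_FACTORS,                                                  # Time 2
]

free_row1 = sum(gamma_pattern[0])          # parameters estimated in row 1
fixed_row1 = N_FACTORS - free_row1         # parameters assumed to be zero
free_total = sum(map(sum, gamma_pattern))  # all estimated gamma coefficients

print(free_row1, fixed_row1, free_total)
```

Each zero in the pattern is an over-identifying restriction of the kind
discussed under goodness of fit.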

Beta is a 2x2 square matrix of coefficients relating the two dependent
factors. The diagonal elements represent the effect of each factor on itself.
$\beta_{12}$ represents the effect of Time 2 on Time 1, and is therefore assumed to
be 0. $\beta_{21}$ represents the effect of Time 1 on Time 2, and is estimated.

Psi is a 2x2 symmetric matrix of errors in equations. The diagonal
elements represent the unexplained variance in each dependent factor after all
the independent variables have their effect calculated. The single
off-diagonal element, $\psi_{21}$, is the correlation between the unexplained variance
of each factor. This is usually assumed to be zero; however, if it is
reasonable to assume that omitted variables are operating (Zellner, 1963),
then $\psi_{21}$ will not equal zero and will upwardly bias the estimation of beta.

Goodness of Fit and Longitudinal Assumptions

The LISREL IV program provides a $\chi^2$ measure of goodness of fit, as an
indicator of distortion introduced by over-identifying restrictions. However,
the $\chi^2$ value is misleading for large samples because its magnitude is a
function of sample size. There is a serious danger of over-fit when
restrictions too diligently refine a model to minimize $\chi^2$.

Some researchers prefer the Tucker-Lewis statistic which is based on the
ratio of the sigma matrix to the covariance matrix, and is therefore
independent of sample size (see Tucker and Lewis, 1973; Knoke, 1979).

In Model 3, the $\chi^2$ value obtained was quite low, 20.79 with 15 degrees of
freedom and a probability of occurrence of .143. This is very low and may
suggest overfit. However, in Model 4, the $\chi^2$ value takes a large step upward,
to 212.34. This suggests that over-identifying restrictions in the gamma or
beta matrices are at fault. Freeing all parameters in the gamma matrix results
in a $\chi^2$ value of 43.73, indicating that over-identifying restrictions in
equation 1 are responsible for most of the distortion picked up by $\chi^2$.
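
The probability attached to a $\chi^2$ value can be checked directly. The
function below is an illustrative pure-Python sketch (the name `chi2_sf` is
ours, not LISREL's); in practice a library routine such as
`scipy.stats.chi2.sf` would normally be used. The 18 degrees of freedom for
Model 4 are taken from Table 5.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square value (illustrative sketch)."""
    a, half_x = df / 2.0, x / 2.0
    if half_x <= 0:
        return 1.0
    # Regularized lower incomplete gamma P(a, x/2) from its power series:
    #   P = x^a * exp(-x) / Gamma(a) * sum_n x^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= half_x / (a + n)
        total += term
    log_prefix = a * math.log(half_x) - half_x - math.lgamma(a)
    lower = total * math.exp(log_prefix)
    return max(0.0, 1.0 - lower)

p_model3 = chi2_sf(20.79, 15)    # Model 3 values quoted in the text
p_model4 = chi2_sf(212.34, 18)   # Model 4 chi-square; df from Table 5

print(f"Model 3: p = {p_model3:.3f}")
print(f"Model 4: p = {p_model4:.3g}")
```

The Model 3 probability falls between .14 and .15, in line with the figure
reported, while the Model 4 probability is effectively zero.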

It is not important to minimize $\chi^2$; however, it is important to balance
goodness of fit against over-identifying restrictions. In this case it would
appear that the structure of the model does not allow for a cross-lagged effect
of reading achievement at Time 1 on classroom assignment in the coming year.
However, these effects, when estimated, are trivial in magnitude.
Consequently, it seems unwarranted to revise the model in order to incorporate
them.



Comparison of Models 2 and 4 

Table 5 compares estimates obtained from Model 2 to those obtained from
Model 4. The more complex factor structure has increased the amount of
variance explained in the Time 1 and Time 2 variables. Of major interest are
the effects of the two factors associated with $X_3$, $X_4$, and $X_{11}$. Factor 1,
which is influenced by X , but not $X_{11}$, has a positive effect greater than the
effect of either $X_3$ or $X_4$ in Model 2. Factor 2, influenced by $X_{11}$ but not
X , has a very large negative effect which is considerably larger than the
negative effect of $X_{11}$ in Model 2. These changes are also reflected in a large
change in the proportion of variance explained in fourth grade reading, .46 in
Model 2, and .55 in Model 4. These results indicate that there may well be a
detrimental effect resulting from inappropriate use of reading aides.

There are other notable changes in estimates. Teacher attendance, $X_6$,
has a small but significant effect in Model 2 but an insignificant effect in
Model 4. Teacher's attendance at outside conferences, $X_7$, is non-significant
in Model 2, but has a small, significant effect in Model 4. Teacher's approval
of the reading program, $X_{10}$, is significant in Model 2 but not in Model 4.
Since these are substantively minuscule effects in either case, one hesitates
to draw conclusions, but it is consistent to suggest that a teacher's
performance may have a great deal to do with his or her effectiveness in using
aides. By drawing out the effect of over-reliance on aides, effects of other
teacher and classroom variables were bound to shift.

TABLE 5

COMPARISON OF MODEL 2 WITH MODEL 4

Model 2 (EQ.1, EQ.2)

    .062    .366    .050

    .083*   .024    .066*   .112*   -.116*  .007
    .056*   -.022*  -.068*  .102*   .085*   .610*

    Proportion of variance explained:   .195    .469
    Chi-square / degrees of freedom:    277.81/8

Model 4 (EQ.1, EQ.2)

    .023*   .311    .737    .125*

    .044*   .012    .22?*   -.396*  -.012   .020
    .060*   -.043*  .111*   .025    .553*

    Proportion of variance explained:   .31     .55
    Chi-square / degrees of freedom:    212.338/18

* Significance less than .05
All figures are standardized estimates

THE ANALYSIS OF MEASUREMENT AND SPECIFICATION ERROR


To this point we have considered three different models. Regardless of
which is considered the best, our tinkering has resulted in changes in signs
and magnitudes of effect coefficients, and also led to changes in
interpretation. The kinds of applied policy recommendations derived would
depend on which model was considered.

However, specification of the causal arrangement of variables is only one
of many choices that can influence the magnitude of effect coefficients. A
major area of skepticism concerning multivariate models is in regard to the
disturbances or error terms.

Two major classes of error concern multivariate models: errors in
variables and errors in equations. Errors in variables generally refers to
validity and reliability of observation. Errors in equations, or
specification error, refers to the empirical adequacy of a model. Omission of
a variable which should be included would result in specification error, as
would treatment of a non-linear relationship as linear.

It is difficult to assure the elimination of error. However, as a
safeguard, it is possible to examine the behavior of parameter estimates under
hypothetical error conditions. Land and Felson (1978) describe a set of
techniques which they refer to as sensitivity analysis. Applied to analysis of
error, one of their suggestions is to hypothesize a range of error conditions,
and examine the consequences of those error conditions on parameter estimates.

As described above, typical identification of two-wave one-variable
models assumes no errors exist in the measurement of variables, no error
covariances exist and the disturbances of the equations are uncorrelated.
Bohrnstedt and Carter's (1971) and Alwin and Jackson's (1980) admonitions
that path analysis is based on very restrictive assumptions and that such
assumptions reflect a blatant unconcern with measurement error problems are
also pertinent here.

The error assumptions typically made in order to identify two-wave
one-variable models are unrealistic. For example, in the Philadelphia data
it is reasonable to assume there is some measurement error in the measurement
of third and fourth grade reading achievement. Since a similar measuring
instrument was used at both times, it may be reasonable to hypothesize
correlated measurement error (see Wiley and Wiley, 1974). The Philadelphia
analysis implicitly assumes the gain score was without error. Moreover, their
independent variables probably contain some measurement error.

Specification error, due to the omission of independent variables, is also
a major concern, especially because the Philadelphia study collected but did
not report on variables over which the school had no control, e.g., race, sex,
and socioeconomic background. The absence of these variables probably creates
an upward bias, overstating the effect of Time 1 on Time 2 reading
achievement. Variables autocorrelated over time tend to be so because the
same set of independent variables tend to be operative at both times, such
that exclusion of an important independent variable results in a spurious
serial effect.
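
The direction of this bias can be shown with a small population-level
calculation. The sketch assumes a deliberately simple two-wave model with a
single omitted variable u; the model, the function name and every number in
it are hypothetical and are not estimates from the Philadelphia data.

```python
# Hypothetical two-wave model with an omitted variable u affecting both waves:
#   y1 = a*u + e1,    y2 = beta*y1 + c*u + e2
# All values are illustrative, not estimates from the Philadelphia data.

def slope_ignoring_u(beta, a, c, var_u=1.0, var_e1=1.0):
    """Population slope obtained when y2 is regressed on y1 alone."""
    var_y1 = a * a * var_u + var_e1
    cov_y1_u = a * var_u
    # The omitted variable's effect c is credited to y1 in proportion to
    # how strongly u and y1 covary.
    return beta + c * cov_y1_u / var_y1

true_beta = 0.5
biased = slope_ignoring_u(true_beta, a=0.6, c=0.4)     # u matters at both times
unbiased = slope_ignoring_u(true_beta, a=0.6, c=0.0)   # u irrelevant at Time 2

print(f"true beta = {true_beta}, estimated = {biased:.3f}")
```

When the omitted variable operates in the same direction at both times, the
estimated serial effect exceeds the true one; when it has no effect at Time
2, the bias disappears.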

If random measurement error is present in the dependent variables,
suppression of the longitudinal effect should be anticipated. On the other
hand, specification error should lead to an upward bias in the longitudinal
effect. Using LISREL, it is possible to deal with these separately. The
matrix $\Theta_\varepsilon$, a 2 x 2 symmetric matrix, represents errors in dependent
variables. $\theta_{\varepsilon 11}$ represents error in $y_1$, $\theta_{\varepsilon 22}$ error in $y_2$, and $\theta_{\varepsilon 21}$
is the correlation between the errors, in this case perhaps a test-retest
bias. To examine the effect of measurement error on $\beta_{21}$, the elements $\theta_{\varepsilon 11}$ and
$\theta_{\varepsilon 22}$ were fixed at 0, .05 and .10 of the variance of their respective y's.
Examining all combinations, nine hypothetical measurement error conditions
were examined, as shown in Figure 1.



                              AMOUNT OF ERROR IN y2

AMOUNT OF ERROR
IN y1                   0             .05            .10

    0                 0, 0          0, .05         0, .10
    .05               .05, 0        .05, .05       .05, .10
    .10               .10, 0        .10, .05       .10, .10

FIGURE 1



The matrix $\Psi$ is also a 2 x 2 symmetric matrix of errors in equations
where $\psi_{11}$ represents the variance of Eta 1, the unexplained variance in
equation 1, and $\psi_{22}$ the variance of Eta 2, the unexplained variance in
equation 2. If specification error is present the residuals of the two
structural equations will be correlated, perhaps reflecting the presence of
omitted variables.




To investigate the potential for specification error to influence $\beta_{21}$, $\psi_{21}$
was fixed at 0, .05, .10 and .15. A sensitivity analysis was performed on the
data (Land and Felson, 1978: 289). Each of the four levels of specification
error was applied to each combination of measurement error, resulting in 36
error models.
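
The design of the sensitivity analysis is a full crossing of the fixed error
levels, which can be enumerated as follows. The dictionary keys are
illustrative labels of ours; only the levels themselves come from the text.

```python
from itertools import product

measurement_levels = [0.0, 0.05, 0.10]           # share of variance fixed as error
specification_levels = [0.0, 0.05, 0.10, 0.15]   # values fixed for psi_21

# Nine measurement error conditions: every (y1, y2) pairing.
measurement_grid = list(product(measurement_levels, repeat=2))

# Each specification level applied to each measurement condition.
error_models = [
    {"err_y1": e1, "err_y2": e2, "psi_21": p}
    for (e1, e2), p in product(measurement_grid, specification_levels)
]

print(len(measurement_grid), len(error_models))
```

Each entry of `error_models` corresponds to one run in which the listed
values are fixed and the remaining parameters are re-estimated.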

Results of the Sensitivity Analysis

Table 6 shows the results of analyzing the thirty-six models containing
combinations of error levels in $Y_1$, $Y_2$ and $\psi_{21}$. Note,
first, that the inclusion of these error estimates did not disturb the
goodness of fit of the basic model, as $\chi^2$ remains constant throughout the
table. Second, there are trends with changes in measurement error. Third,
there is yet another set of trends to be interpreted in relation to changes in
specification error, $\psi_{21}$.

These are maximum likelihood estimates but these trends are identical to
what would be expected from least-squares estimates, given an assumption from
measurement theory that for large samples, the estimated variance of a variable
measured with error will always be higher than the true variance, but that in
cross product expressions, random error will tend to cancel, resulting in
unbiased cross product estimates (Siegel and Hodge, 1968).

Table 6 shows three trends relating to measurement error. First, the
unstandardized beta estimates increase with removal of error from $Y_1$, but
remain constant with removal of error from $Y_2$. In other words, measurement
error in the antecedent Time 1 variable will result in downward bias in
the unstandardized beta.
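
This asymmetry follows from the measurement-theory assumption just cited:
random error inflates variances but not cross products. A minimal sketch of
the large-sample slope, with hypothetical values and an illustrative function
name:

```python
# Large-sample slope of observed Y2 on observed Y1 when random error is
# added to either score. All numbers are hypothetical.

def observed_slope(beta, var_true_y1, err_var_y1=0.0, err_var_y2=0.0):
    # Random error in Y2 (err_var_y2) cancels in the cross product, so it
    # deliberately plays no part in the calculation ...
    cov_y1_y2 = beta * var_true_y1
    # ... but random error in Y1 inflates the variance in the denominator.
    var_obs_y1 = var_true_y1 + err_var_y1
    return cov_y1_y2 / var_obs_y1

beta = 0.6
clean = observed_slope(beta, var_true_y1=1.0)
error_in_y2 = observed_slope(beta, 1.0, err_var_y2=0.25)
error_in_y1 = observed_slope(beta, 1.0, err_var_y1=0.25)

print(clean, error_in_y2, error_in_y1)
```

Error in the Time 2 score leaves the unstandardized slope where it was,
whereas error in the Time 1 score pulls it toward zero.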



TABLE 6

RESULTING PARAMETER ESTIMATES
GIVEN ERROR ASSUMPTIONS IN Y1, Y2, AND PSI 21

    .10     .034    .330
    .15     .034    .307
    .05     .10     .15     .405    .309    .035    .370    212.3
    .10     .10     .15     .447    .418    .036    .349    212.3

Second, the value of the standardized beta estimate increases with removal
of error from either $Y_1$ or $Y_2$. This can be explained by recalling that
unstandardized and standardized estimates are related by equation (9), where
$b$ = standardized beta, $\beta$ = unstandardized beta, and $S$ = standard deviation:

$$b = \beta \, \frac{S_{y_1}}{S_{y_2}} \qquad (9)$$

If $\beta$ remains constant, and error is removed from $Y_2$, $S_{y_2}$ will shift downward,
requiring that $b$ shift upward. Consequently, with $S_{y_1}$ and $\beta$ constant,
standardized estimates will be downwardly biased by random measurement error in
$Y_2$.
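
Equation (9) can be evaluated directly to see the size of this effect. The
values below are hypothetical; the only assumption carried over from the
sensitivity analysis is that a fixed share of the observed variance of the
Time 2 score is treated as error and removed.

```python
import math

# Equation (9): b = beta * S_y1 / S_y2. Hypothetical values throughout.
def standardized(beta, var_y1, var_y2):
    return beta * math.sqrt(var_y1) / math.sqrt(var_y2)

beta = 0.6
var_y1 = 1.0
var_y2_observed = 1.0                  # includes random measurement error
var_y2_true = 0.9 * var_y2_observed    # 10% of the variance removed as error

b_observed = standardized(beta, var_y1, var_y2_observed)
b_corrected = standardized(beta, var_y1, var_y2_true)

print(f"b with error in Y2:    {b_observed:.3f}")
print(f"b with error removed:  {b_corrected:.3f}")
```

With the unstandardized beta held fixed, shrinking the standard deviation in
the denominator is enough to raise the standardized estimate.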

Third, Table 6 also shows the standard error of beta to increase with
removal of error from either the independent or the dependent variable,
despite the tendency of $\beta$ to increase. Ordinarily, one would
expect $\sigma_\beta$ to decrease as $\beta$ increases. A bothersome aspect of measurement
error is that it produces downwardly biased estimates of the standard error of
beta, even given the downward bias of $\beta$. Note, for example, that where
error in $Y_1$ = 0, $Y_2$ = 0, $\psi_{21}$ = 0, we find $\beta$ = .592 and $\sigma_\beta$ = .029, while where
error in $Y_1$ = .10, $Y_2$ = .10, $\psi_{21}$ = 0, we find $\beta$ = .694 and $\sigma_\beta$ = .032; the
standard error increases despite an increase in $\beta$.

In this example, the bias in the standard error is not critical only
because the sample size is large, 1363. In smaller samples, this bias could
result in an inappropriate inference.

In addition to an analysis of measurement error, Table 6 contains an
analysis of specification error. The matrix $\Psi$ contains, in addition to the
unexplained variances of $\eta_1$ and $\eta_2$, a cross product term, $\psi_{21}$, representing
the covariance between $\eta_1$ and $\eta_2$. The explained variance in $\eta_2$ is
accounted for by gamma estimates in equation 2, representing effects of the
independent variables, and by $\beta$, representing the effect of $\eta_1$, the grade 3
reading scores.

$\beta$ represents a peculiar effect because, unlike the independent variables,
$y_1$, a test score, is not a measure of an attribute of a causal agent, but an
indicator of the student's previous accomplishment. It is arguable that $\beta$
does not represent a causal influence at all, but a spurious result of the
influence of unobserved variables operating at both Time 1 and Time 2.
Consequently, the covariance between $\eta_1$ and $\eta_2$ which is accounted for by $\beta$
could just as well be allocated to $\psi_{21}$.

The consequences of adding increasing proportions of covariance to $\psi_{21}$ are
summarized in Table 6. Note that as the error in $\psi_{21}$ increases $\beta$ decreases,
and as the standard error of $\beta$ increases the unexplained variance in $\eta_2$
increases, and this trend holds regardless of measurement error in $y_1$ and $y_2$.

This analytical procedure shows that problems associated with measurement
error can be studied by setting bounds on parameter estimates under a range of
realistic error conditions. In the analysis of measurement error, $\beta$ ranged
from a low of .553 to a high of .648, with a standard error from .029 to .032
under what actually are optimistic assumptions about the low magnitudes of
measurement error.



Results of Testing a Model with Realistic Error Assumption

What are realistic error assumptions to use in analyzing the Philadelphia
data? Reliability coefficients between alternate forms of the same test are
characteristically in the range of .85 to .94 (Thorndike and Hagen, 1977:
92), suggesting that measurement error in achievement tests could
realistically be set at 25%. A separate analysis was done setting measurement
error at .25 on both $Y_1$ and $Y_2$, and including a 5% correlation in $\theta_{\varepsilon 21}$ to
allow test-retest bias. Additionally, the off-diagonal psi, $\psi_{21}$, was set at
.05 to allow for specification errors due to omitted variables. The resulting
$\beta$ was .788, the $\sigma_\beta$ was .044, and $\psi_{22}$ was .144.

Including higher rates of error in the longitudinal variables had the
effect of removing a downward bias in the beta linking Time 1 and Time 2. An
increase in beta occurred even though the setting of $\psi_{21}$ at .05 removed an
upward bias in beta due to specification error.

CONCLUDING COMMENTS

The original report, What Works in Reading? (Kean et al., 1979) bases
conclusions on questionable methodological practices. Among these are the
procedure by which eighteen "significant" independent variables (out of 245)
were selected, the uncritical use of gain scores, and disregard for problems
of measurement error.

While it appears futile to salvage substantive conclusions, important
methodological lessons can be drawn from the secondary analysis above.

The procedure by which independent variables were selected reflected a
lack of theoretical guidance although theoretical guidance is essential in
multivariate analysis. With a large number of variables, statistical
significance is not a useful criterion. At the 95% confidence level, 12
zero-order associations can be expected to be "significant" by chance. Here we
do not begin to count an astronomical number of partials, of which five
percent will also be large enough to pass a significance test by chance alone.
One quasi-theoretical decision guiding the analysis was the decision to drop
variables outside the control of the school district, including race,
socioeconomic status and sex. This ill-advised step appeared motivated by an
understandable desire to minimize controversy, but it was counterproductive to
an understanding of the data base.
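
The chance expectation for the zero-order associations is a one-line
calculation, sketched below. The second quantity additionally assumes that
the 245 tests are independent, which is our simplifying assumption and not a
claim made in the original report.

```python
n_tests = 245   # candidate independent variables in the original study
alpha = 0.05    # significance level implied by 95% confidence

# Number of associations expected to pass the test by chance alone.
expected_false_positives = n_tests * alpha

# Chance of at least one spurious "significant" result, under the added
# (hypothetical) assumption that the tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"expected by chance: {expected_false_positives:.2f}")
print(f"P(at least one):    {p_at_least_one:.6f}")
```

Roughly twelve chance findings set against eighteen selected variables shows
why significance alone cannot justify the selection.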

The secondary analysis reported here entailed successive refinements.
This is not to say that other approaches would not be equally appropriate.
For example, the gain score model could also be refined by "residualizing"
the gain score variable, as recommended by Bohrnstedt (1969).



The gain score model (Model 1) yields results which are virtually
uninterpretable. Effects contradict long standing principles of educational
practice. The longitudinal model (Model 2) results in dramatic changes in
magnitude and sign of effects in contrast to Model 1. Effects in Model 2 are
also more in agreement with prior expectations (see Rankin, 1980).

Subsequent refinements, including the introduction of a 10 factor
measurement model on the independent side (Model 3), and the analysis of
sensitivity to measurement error on the dependent side (Model 4), do not
result in dramatic shifts in parameter estimates, but they do illustrate
techniques which can and should be applied as new generations of software make
them not only practical, but easily accessible to sociological and educational
researchers everywhere.